Multi-Party Speech Recovery Exploiting Structured Sparsity Models
Authors
Abstract
We study the sparsity of the spectro-temporal representation of speech in reverberant acoustic conditions. This study motivates the use of structured sparsity models for efficient speech recovery. We formulate underdetermined convolutive speech separation in the spectro-temporal domain as a sparse signal recovery problem, for which we leverage model-based recovery algorithms. To resolve the ambiguity of real room acoustics, we exploit the Image Model of the enclosure to estimate the room impulse response through a structured-sparsity-constrained optimization. Experiments conducted on real recordings demonstrate the effectiveness of the proposed approach for multi-party speech applications.
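As a rough illustration of what "model-based recovery" can mean, the sketch below runs iterative hard thresholding with a block-sparsity model on a small real-valued toy problem. It is not the method of the paper: the block model, the random Gaussian measurement matrix, the problem sizes, and the function names `block_hard_threshold` and `block_iht` are all assumptions made for this example, and the actual work operates on spectro-temporal speech representations.

```python
import numpy as np

def block_hard_threshold(x, block_size, k):
    """Keep the k blocks of x with the largest energy and zero the rest."""
    blocks = x.reshape(-1, block_size)
    energy = np.sum(blocks ** 2, axis=1)
    keep = np.argsort(energy)[-k:]          # indices of the k strongest blocks
    out = np.zeros_like(blocks)
    out[keep] = blocks[keep]
    return out.reshape(-1)

def block_iht(A, y, block_size, k, n_iter=300):
    """Iterative hard thresholding with a block-sparse model (toy sketch)."""
    # Step size 1 / ||A||_2^2 keeps the residual norm from increasing.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        gradient_step = x + step * A.T @ (y - A @ x)
        x = block_hard_threshold(gradient_step, block_size, k)
    return x

# Hypothetical toy problem: 80 measurements of a 120-dimensional signal
# whose nonzeros are confined to 3 blocks of length 4.
rng = np.random.default_rng(0)
n, m, block_size, k = 120, 80, 4, 3
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
for b in rng.choice(n // block_size, size=k, replace=False):
    x_true[b * block_size:(b + 1) * block_size] = rng.standard_normal(block_size)
y = A @ x_true

x_hat = block_iht(A, y, block_size, k)
print("residual norm:", np.linalg.norm(y - A @ x_hat))
```

The only difference from plain sparse recovery is the projection step: coefficients are kept or discarded as whole blocks, which is one simple way of encoding structure in the sparsity pattern.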
Similar resources
Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings
We tackle the multi-party speech recovery problem by modeling the acoustics of reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustics from the unknown competing speech sources, relying on localization of the early images of the speakers by sparse approximation of the spa...
Model-based Sparse Component Analysis for Multiparty Distant Speech Recognition
This thesis takes place in the context of multi-microphone distant speech recognition in multiparty meetings. It addresses the fundamental problem of overlapping speech recognition in reverberant rooms. Motivated by the excellent human hearing performance on such problems, possibly resulting from the sparsity of the auditory representation, our work aims at exploiting sparse component analysis in sp...
Structured sparse coding for microphone array location calibration
We address the problem of microphone location calibration where the sensor positions have a sparse spatial approximation on a discretized grid. We characterize the microphone signals as a sparse vector represented over a codebook of multi-channel signals where the support of the representation encodes the microphone locations. The codebook is constructed of multi-channel signals obtained by inv...
A general framework for multi-channel speech dereverberation exploiting sparsity
We consider the problem of blind multi-channel speech dereverberation without the knowledge of room acoustics. The dereverberated speech component is estimated by subtracting the undesired component, estimated using multi-channel linear prediction (MCLP), from the reference microphone signal. In this paper we present a framework for MCLP-based speech dereverberation by exploiting sparsity in th...
Fast, Sample-Efficient Algorithms for Structured Phase Retrieval
We consider the problem of recovering a signal x∗ ∈ R^n from magnitude-only measurements y_i = |〈a_i, x∗〉| for i = 1, 2, ..., m. Also known as the phase retrieval problem, it is a fundamental challenge in nano-, bio- and astronomical imaging systems, and speech processing. The problem is ill-posed, and therefore additional assumptions on the signal and/or the measurements are necessary. In this ...